AITopics | beta regression

beta regression

Lasso Penalization for High-Dimensional Beta Regression Models: Computation, Analysis, and Inference

Ramezani, Niloofar, Slawski, Martin

arXiv.org Machine Learning, Jul-29-2025

Beta regression is commonly employed when the outcome variable is a proportion. Since its conception, the approach has been widely used in applications spanning various scientific fields. A series of extensions have been proposed over time, several of which address variable selection and penalized estimation, e.g., with an $\ell_1$-penalty (LASSO). However, a theoretical analysis of this popular approach in the context of Beta regression with high-dimensional predictors is lacking. In this paper, we aim to close this gap. A particular challenge arises from the non-convexity of the associated negative log-likelihood, which we address by resorting to a framework for analyzing stationary points in a neighborhood of the target parameter. Leveraging this framework, we derive a non-asymptotic bound on the $\ell_1$-error of such stationary points. In addition, we propose a debiasing approach to construct confidence intervals for the regression parameters. A proximal gradient algorithm is devised for optimizing the resulting penalized negative log-likelihood function. Our theoretical analysis is corroborated via simulation studies, and a real data example concerning the prediction of county-level proportions of incarceration is presented to showcase the practical utility of our methodology.
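The estimation idea in this abstract can be illustrated with a minimal sketch of proximal gradient descent on an $\ell_1$-penalized Beta regression negative log-likelihood. It is not the authors' implementation. It assumes a logit link, a known precision `phi`, an unpenalized intercept, and synthetic data; the function names, the backtracking rule, and the value of `lam` are all illustrative choices.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    r = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - r * (1 / 12 - r * (1 / 120 - r / 252))


def mean_of(beta, xi):
    """Logit-link mean, clamped away from 0 and 1 for numerical safety."""
    eta = max(-30.0, min(30.0, sum(b * v for b, v in zip(beta, xi))))
    return min(max(1.0 / (1.0 + math.exp(-eta)), 1e-6), 1.0 - 1e-6)


def neg_loglik(beta, X, y, phi):
    """Average negative log-likelihood, y_i ~ Beta(mu_i*phi, (1-mu_i)*phi)."""
    total = 0.0
    for xi, yi in zip(X, y):
        mu = mean_of(beta, xi)
        p, q = mu * phi, (1.0 - mu) * phi
        total -= (math.lgamma(phi) - math.lgamma(p) - math.lgamma(q)
                  + (p - 1.0) * math.log(yi) + (q - 1.0) * math.log(1.0 - yi))
    return total / len(y)


def gradient(beta, X, y, phi):
    """Gradient of neg_loglik with respect to beta (chain rule through mu)."""
    g = [0.0] * len(beta)
    for xi, yi in zip(X, y):
        mu = mean_of(beta, xi)
        y_star = math.log(yi / (1.0 - yi))
        mu_star = digamma(mu * phi) - digamma((1.0 - mu) * phi)
        w = -phi * (y_star - mu_star) * mu * (1.0 - mu)
        for j, v in enumerate(xi):
            g[j] += w * v / len(y)
    return g


def soft_threshold(z, t):
    """Proximal operator of t * |z|."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def objective(beta, X, y, phi, lam):
    """Penalized objective; the intercept beta[0] is left unpenalized."""
    return neg_loglik(beta, X, y, phi) + lam * sum(abs(b) for b in beta[1:])


def prox_grad(X, y, phi, lam, iters=100):
    """Proximal gradient with backtracking, started from beta = 0."""
    beta, step = [0.0] * len(X[0]), 1.0
    for _ in range(iters):
        g, f_old = gradient(beta, X, y, phi), neg_loglik(beta, X, y, phi)
        while True:
            new = [beta[0] - step * g[0]] + [
                soft_threshold(b - step * gj, step * lam)
                for b, gj in zip(beta[1:], g[1:])]
            d = [n - b for n, b in zip(new, beta)]
            bound = (f_old + sum(gj * dj for gj, dj in zip(g, d))
                     + sum(dj * dj for dj in d) / (2.0 * step))
            if neg_loglik(new, X, y, phi) <= bound + 1e-12:
                break
            step *= 0.5  # shrink the step until the quadratic bound holds
        beta = new
    return beta


# Synthetic data (assumption): two active and two inactive predictors.
random.seed(0)
true_beta, phi = [0.2, 0.8, 0.0, -0.8, 0.0], 20.0
X = [[1.0] + [random.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
y = []
for xi in X:
    mu = mean_of(true_beta, xi)
    y.append(random.betavariate(mu * phi, (1.0 - mu) * phi))

beta_hat = prox_grad(X, y, phi, lam=0.02)
print([round(b, 3) for b in beta_hat])
```

Because the negative log-likelihood is non-convex, a run like this only finds a stationary point near its starting value, which is the kind of point the abstract's analysis concerns.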

artificial intelligence, machine learning, regression, (18 more...)

arXiv.org Machine Learning

2507.20079

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Virginia > Richmond (0.04)

Genre: Research Report (0.82)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
Law (0.93)
Health & Medicine > Health Care Providers & Services (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.83)

Testing Hypotheses of Covariate Effects on Topics of Discourse

Phelan, Gabriel, Campbell, David A.

arXiv.org Machine Learning, Jun-9-2025

We introduce an approach to topic modelling with document-level covariates that remains tractable in the face of large text corpora. This is achieved by de-emphasizing the role of parameter estimation in an underlying probabilistic model, assuming instead that the data come from a fixed but unknown distribution whose statistical functionals are of interest. We propose combining a convex formulation of non-negative matrix factorization with standard regression techniques as a fast-to-compute and useful estimate of such a functional. Uncertainty quantification can then be achieved by reposing non-parametric resampling methods on top of this scheme. This is in contrast to popular topic modelling paradigms, which posit a complex and often hard-to-fit generative model of the data. We argue that the simple, non-parametric approach advocated here is faster, more interpretable, and enjoys better inferential justification than said generative models. Finally, our methods are demonstrated with an application analysing covariate effects on discourse of flavours attributed to Canadian beers.
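The resampling step mentioned in this abstract can be illustrated with a short percentile-bootstrap sketch for a regression slope. It is a simplification under stated assumptions: the per-document topic weights `w` are synthetic and held fixed rather than re-estimated by matrix factorization inside each resample, and there is a single covariate `x`. All names and constants are hypothetical and may differ from the authors' scheme.

```python
import random

random.seed(2)


def ols_slope(x, w):
    """Least-squares slope of w on x (single covariate)."""
    mx, mw = sum(x) / len(x), sum(w) / len(w)
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sum((xi - mx) * (wi - mw) for xi, wi in zip(x, w)) / sxx


# Synthetic stand-in (assumption) for one topic's weight per document.
n = 100
x = [random.uniform(0, 1) for _ in range(n)]
w = [0.2 + 0.3 * xi + random.gauss(0, 0.05) for xi in x]

slope_hat = ols_slope(x, w)

# Non-parametric bootstrap: resample documents with replacement.
boot = []
for _ in range(1000):
    idx = [random.randrange(n) for _ in range(n)]
    boot.append(ols_slope([x[i] for i in idx], [w[i] for i in idx]))
boot.sort()
lo, hi = boot[24], boot[974]  # approximate 95% percentile interval
print(f"slope {slope_hat:.3f}, interval ({lo:.3f}, {hi:.3f})")
```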

covariate, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2506.0557

Country:

Asia > India (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Handling bounded response in high dimensions: a Horseshoe prior Bayesian Beta regression approach

Mai, The Tien

arXiv.org Machine Learning, May-29-2025

Bounded continuous responses -- such as proportions -- arise frequently in diverse scientific fields including climatology, biostatistics, and finance. Beta regression is a widely adopted framework for modeling such data, due to the flexibility of the Beta distribution over the unit interval. While Bayesian extensions of Beta regression have shown promise, existing methods are limited to low-dimensional settings and lack theoretical guarantees. In this work, we propose a novel Bayesian framework for high-dimensional sparse Beta regression that employs a tempered posterior. Our method incorporates the Horseshoe prior for effective shrinkage and variable selection. Most notably, we propose a novel Gibbs sampling algorithm using Pólya-Gamma augmentation for efficient inference in the Beta regression model. We also provide the first theoretical results establishing posterior consistency and convergence rates for Bayesian Beta regression. Through extensive simulation studies in both low- and high-dimensional scenarios, we demonstrate that our approach outperforms existing alternatives, offering improved estimation accuracy and model interpretability. Our method is implemented in the R package "betaregbayes", available on GitHub.
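To make the shrinkage behaviour of the Horseshoe prior concrete, the following sketch draws from the prior itself. It is not the Gibbs sampler from the abstract and does not use the betaregbayes package. It assumes the standard form of the prior, with coefficients $\beta_j \sim N(0, \tau^2 \lambda_j^2)$ and local scales $\lambda_j$ drawn from a standard half-Cauchy, and fixes the global scale `tau` at 1. The function names are illustrative.

```python
import math
import random

random.seed(1)


def half_cauchy():
    """Standard half-Cauchy draw: absolute value of an inverse-CDF Cauchy draw."""
    return abs(math.tan(math.pi * (random.random() - 0.5)))


def horseshoe_draw(p, tau=1.0):
    """One prior draw of p coefficients and their local scales."""
    lambdas = [half_cauchy() for _ in range(p)]
    betas = [random.gauss(0.0, tau * lam) for lam in lambdas]
    return betas, lambdas


betas, lambdas = horseshoe_draw(10000)

# Shrinkage weight kappa_j = 1 / (1 + lambda_j^2): near 1 means the
# coefficient is pulled to zero, near 0 means it is left almost untouched.
kappas = [1.0 / (1.0 + lam * lam) for lam in lambdas]
extreme = sum(k < 0.1 or k > 0.9 for k in kappas) / len(kappas)
print(f"share of shrinkage weights near 0 or 1: {extreme:.2f}")
```

With `tau = 1` the weights follow a Beta(1/2, 1/2) distribution, whose mass piles up at both ends of the unit interval. That U shape is what lets the prior shrink noise coefficients strongly while leaving large signals nearly unshrunk.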

artificial intelligence, machine learning, regression, (15 more...)

arXiv.org Machine Learning

2505.22211

Country:

South America > Colombia (0.04)
North America > United States > Iowa (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

BERT-Beta: A Proactive Probabilistic Approach to Text Moderation

Tan, Fei, Hu, Yifan, Yen, Kevin, Hu, Changwei

arXiv.org Artificial Intelligence, Sep-17-2021

Text moderation for user-generated content, which helps to promote healthy interaction among users, has been widely studied, and many machine learning models have been proposed. In this work, we explore an alternative perspective by augmenting reactive reviews with proactive forecasting. Specifically, we propose a new concept, text toxicity propensity, to characterize the extent to which a text tends to attract toxic comments. Beta regression is then introduced to do the probabilistic modeling, which is demonstrated to function well in comprehensive experiments. We also propose an explanation method to communicate the model decision clearly. Both propensity scoring and interpretation benefit text moderation in a novel manner. Finally, the proposed scaling mechanism for the linear model offers useful insights beyond this work.

moderation, proceedings, toxicity propensity, (13 more...)

arXiv.org Artificial Intelligence

2109.08805

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Exploring Topic-Metadata Relationships with the STM: A Bayesian Approach

Schulze, P., Wiegrebe, S., Thurner, P. W., Heumann, C., Aßenmacher, M., Wankmüller, S.

arXiv.org Machine Learning, Apr-6-2021

Topic models such as the Structural Topic Model (STM) estimate latent topical clusters within text. An important step in many topic modeling applications is to explore relationships between the discovered topical structure and metadata associated with the text documents. Methods used to estimate such relationships must take into account that the topical structure is not directly observed, but is instead itself estimated. The authors of the STM, for instance, perform repeated OLS regressions of sampled topic proportions on metadata covariates by using a Monte Carlo sampling technique known as the method of composition. In this paper, we propose two improvements: first, we replace OLS with the more appropriate Beta regression. Second, we suggest a fully Bayesian approach instead of the current blending of frequentist and Bayesian methods. We demonstrate our improved methodology by exploring relationships between Twitter posts by German members of parliament (MPs) and different metadata covariates.
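The method of composition described in this abstract can be sketched as a loop: draw topic proportions from their approximate posterior, fit a regression, draw a coefficient from that fit's sampling distribution, and pool the draws. The sketch below shows the OLS baseline, not the Beta regression or fully Bayesian variants the paper proposes. As a labeled simplification, it stands in a Beta distribution with hypothetical concentration `kappa` for each document's posterior and uses one synthetic covariate.

```python
import math
import random

random.seed(3)


def ols(x, y):
    """Slope and its standard error for simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


# Synthetic setup (assumption): posterior mean topic proportion per document.
n_docs, kappa = 150, 50.0
x = [random.uniform(0, 1) for _ in range(n_docs)]
post_mean = [0.2 + 0.4 * xi for xi in x]

draws = []
for _ in range(500):
    # Step 1: sample topic proportions from the (stand-in) posterior.
    theta = [random.betavariate(m * kappa, (1 - m) * kappa) for m in post_mean]
    # Step 2: regress the sampled proportions on the covariate.
    slope, se = ols(x, theta)
    # Step 3: sample a coefficient from the fit's approximate distribution.
    draws.append(random.gauss(slope, se))

draws.sort()
mean_slope = sum(draws) / len(draws)
lo, hi = draws[12], draws[487]  # approximate 95% interval from pooled draws
print(f"slope {mean_slope:.3f}, interval ({lo:.3f}, {hi:.3f})")
```

The pooled draws reflect both regression uncertainty and uncertainty in the topic proportions, which a single regression on point estimates would ignore.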

beta regression, regression, topic proportion, (13 more...)

arXiv.org Machine Learning

2104.02496

Country:

North America > United States (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > United Kingdom (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Government > Regional Government (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)
